AITopics | euler method

Collaborating Authors

euler method

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Discrete Diffusion Models: Novel Analysis and New Sampler Guarantees

Neural Information Processing SystemsJun-23-2026, 02:26:58 GMT

Discrete diffusion models have recently gained significant prominence in applications involving natural language and graph data. A key factor influencing their effectiveness is the efficiency of discretized samplers. Among these, τ-leaping samplers have become particularly popular due to their theoretical and empirical success. However, existing theoretical analyses of τ-leaping often rely on somewhat restrictive and difficult-to-verify regularity assumptions, and their convergence bounds contain quadratic dependence on the vocabulary size. In this work, we introduce a new analytical approach for discrete diffusion models that removes the need for such assumptions. For the standard τ-leaping method, we establish convergence guarantees in KL divergence that scale linearly with vocabulary size, improving upon prior results with quadratic dependence. Our approach is also more broadly applicable: it provides the first convergence guarantees for other widely used samplers, including the Euler method and Tweedie τ-leaping. Central to our approach is a novel technique based on differential inequalities, offering a more flexible alternative to the traditional Girsanov change-of-measure methods. This technique may also be of independent interest for the analysis of other stochastic processes.

large language model, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: North America > United States (0.67)

Genre: Research Report > Experimental Study (1.00)

Industry: Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.46)

Add feedback

Sharp Convergence Rates for Masked Diffusion Models

Liang, Yuchen, Tan, Zhiheng, Shroff, Ness, Liang, Yingbin

arXiv.org Machine LearningFeb-27-2026

Discrete diffusion models have achieved strong empirical performance in text and other symbolic domains, with masked (absorbing-rate) variants emerging as competitive alternatives to autoregressive models. Among existing samplers, the Euler method remains the standard choice in many applications, and more recently, the First-Hitting Sampler (FHS) has shown considerable promise for masked diffusion models. Despite their practical success, the theoretical understanding of these samplers remains limited. Existing analyses are conducted in Kullback-Leibler (KL) divergence, which often yields loose parameter dependencies and requires strong assumptions on score estimation. Moreover, these guarantees do not cover recently developed high-performance sampler of FHS. In this work, we first develop a direct total-variation (TV) based analysis for the Euler method that overcomes these limitations. Our results relax assumptions on score estimation, improve parameter dependencies, and establish convergence guarantees without requiring any surrogate initialization. Also for this setting, we provide the first convergence lower bound for the Euler sampler, establishing tightness with respect to both the data dimension $d$ and the target accuracy $\varepsilon$. Finally, we analyze the FHS sampler and show that it incurs no sampling error beyond that induced by score estimation, which we show to be tight with a matching lower error bound. Overall, our analysis introduces a direct TV-based error decomposition along the CTMC trajectory and a decoupling-based path-wise analysis for FHS, which may be of independent interest.

artificial intelligence, diffusion model, machine learning, (17 more...)

arXiv.org Machine Learning

2602.22505

Country:

North America > United States > Ohio (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Government (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Discrete Diffusion Models: Novel Analysis and New Sampler Guarantees

Liang, Yuchen, Liang, Yingbin, Lai, Lifeng, Shroff, Ness

arXiv.org Artificial IntelligenceNov-3-2025

Discrete diffusion models have recently gained significant prominence in applications involving natural language and graph data. A key factor influencing their effectiveness is the efficiency of discretized samplers. Among these, $τ$-leaping samplers have become particularly popular due to their theoretical and empirical success. However, existing theoretical analyses of $τ$-leaping often rely on somewhat restrictive and difficult-to-verify regularity assumptions, and their convergence bounds contain quadratic dependence on the vocabulary size. In this work, we introduce a new analytical approach for discrete diffusion models that removes the need for such assumptions. For the standard $τ$-leaping method, we establish convergence guarantees in KL divergence that scale linearly with vocabulary size, improving upon prior results with quadratic dependence. Our approach is also more broadly applicable: it provides the first convergence guarantees for other widely used samplers, including the Euler method and Tweedie $τ$-leaping. Central to our approach is a novel technique based on differential inequalities, offering a more flexible alternative to the traditional Girsanov change-of-measure methods. This technique may also be of independent interest for the analysis of other stochastic processes.

machine learning, natural language, sampler, (17 more...)

arXiv.org Artificial Intelligence

2509.16756

Country: North America > United States (0.67)

Genre: Research Report > Promising Solution (0.34)

Industry: Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Stable neural networks and connections to continuous dynamical systems

Ehrhardt, Matthias J., Murari, Davide, Sherry, Ferdia

arXiv.org Artificial IntelligenceOct-28-2025

The existence of instabilities, for example in the form of adversarial examples, has given rise to a highly active area of research concerning itself with understanding and enhancing the stability of neural networks. We focus on a popular branch within this area which draws on connections to continuous dynamical systems and optimal control, giving a bird's eye view of this area. We identify and describe the fundamental concepts that underlie much of the existing work in this area. Following this, we go into more detail on a specific approach to designing stable neural networks, developing the theoretical background and giving a description of how these networks can be implemented. We provide code that implements the approach that can be adapted and extended by the reader. The code further includes a notebook with a fleshed-out toy example on adversarial robustness of image classification that can be run without heavy requirements on the reader's computer. We finish by discussing this toy example so that the reader can interactively follow along on their computer. This work will be included as a chapter of a book on scientific machine learning, which is currently under revision and aimed at students.

artificial intelligence, machine learning, neural network, (18 more...)

arXiv.org Artificial Intelligence

2510.22299

Country: North America > United States > California (0.28)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

IIET: Efficient Numerical Transformer via Implicit Iterative Euler Method

Liu, Xinyu, Li, Bei, Liu, Jiahao, Ruan, Junhao, Jiao, Kechen, Tang, Hongyin, Wang, Jingang, Tong, Xiao, Zhu, Jingbo

arXiv.org Artificial IntelligenceOct-14-2025

High-order numerical methods enhance Transformer performance in tasks like NLP and CV, but introduce a performance-efficiency trade-off due to increased computational overhead. Our analysis reveals that conventional efficiency techniques, such as distillation, can be detrimental to the performance of these models, exemplified by PCformer. To explore more optimizable ODE-based Transformer architectures, we propose the Iterative Implicit Euler Transformer (IIET), which simplifies high-order methods using an iterative implicit Euler approach. This simplification not only leads to superior performance but also facilitates model compression compared to PCformer. To enhance inference efficiency, we introduce Iteration Influence-Aware Distillation (IIAD). Through a flexible threshold, IIAD allows users to effectively balance the performance-efficiency trade-off. On lm-evaluation-harness, IIET boosts average accuracy by 2.65% over vanilla Transformers and 0.8% over PCformer. Its efficient variant, E-IIET, significantly cuts inference overhead by 55% while retaining 99.4% of the original task accuracy. Moreover, the most efficient IIET variant achieves an average performance gain exceeding 1.6% over vanilla Transformer with comparable speed.

iiet, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2509.22463

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

drawing connections to Feldman's work (L36), but we agree that the relation between the three topics should be

Neural Information Processing SystemsAug-19-2025, 23:22:35 GMT

Thank you all for your thoughtful comments; we address your concerns below. The MDL principle formalizes Occam's razor and is a We will add the discussion of such relevant studies to section 1. We will add these results and accompanying visualizations to appendix. Model (solver) MAC DAFT MAC (euler) DAFT MAC (rk4) DAFT MAC (dopri5; used in training)Time (ms) 153. We found that during evaluation, rk4 solves all the dynamics generated from CLEVR dataset.

daft mac, dataset, mac, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.51)

Add feedback

b21f9f98829dea9a48fd8aaddc1f159d-Supplemental.pdf

Neural Information Processing SystemsAug-16-2025, 22:41:52 GMT

artificial intelligence, dataset, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > New York (0.04)
North America > United States > New Jersey (0.04)
Asia > Japan (0.04)

Genre: Research Report (0.46)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.93)
Materials > Chemicals (0.93)
Health & Medicine > Therapeutic Area > Immunology (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.49)

Add feedback

From Text to Trajectories: GPT-2 as an ODE Solver via In-Context

Ma, Ziyang, Zhou, Baojian, Yang, Deqing, Xiao, Yanghua

arXiv.org Artificial IntelligenceAug-6-2025

In-Context Learning (ICL) has emerged as a new paradigm in large language models (LLMs), enabling them to perform novel tasks by conditioning on a few examples embedded in the prompt. Yet, the highly nonlinear behavior of ICL for NLP tasks remains poorly understood. To shed light on its underlying mechanisms, this paper investigates whether LLMs can solve ordinary differential equations (ODEs) under the ICL setting. We formulate standard ODE problems and their solutions as sequential prompts and evaluate GPT-2 models on these tasks. Experiments on two types of ODEs show that GPT-2 can effectively learn a meta-ODE algorithm, with convergence behavior comparable to, or better than, the Euler method, and achieve exponential accuracy gains with increasing numbers of demonstrations. Moreover, the model generalizes to out-of-distribution (OOD) problems, demonstrating robust extrapolation capabilities. These empirical findings provide new insights into the mechanisms of ICL in NLP and its potential for solving nonlinear numerical problems.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2508.03031

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Numerical Artifacts in Learning Dynamical Systems

Lu, Bing-Ze, Tsai, Richard

arXiv.org Artificial IntelligenceJul-29-2025

In many applications, one needs to learn a dynamical system from its solutions sampled at a finite number of time points. The learning problem is often formulated as an optimization problem over a chosen function class. However, in the optimization procedure, it is necessary to employ a numerical scheme to integrate candidate dynamical systems and assess how their solutions fit the data. This paper reveals potentially serious effects of a chosen numerical scheme on the learning outcome. In particular, our analysis demonstrates that a damped oscillatory system may be incorrectly identified as having "anti-damping" and exhibiting a reversed oscillation direction, despite adequately fitting the given data points.

artificial intelligence, machine learning, optimization problem, (15 more...)

arXiv.org Artificial Intelligence

2507.14491

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.46)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Learning Null Geodesics for Gravitational Lensing Rendering in General Relativity

Sun, Mingyuan, Fang, Zheng, Wang, Jiaxu, Zhang, Kunyi, Zhang, Qiang, Xu, Renjing

arXiv.org Artificial IntelligenceJul-22-2025

We present GravLensX, an innovative method for rendering black holes with gravitational lensing effects using neural networks. The methodology involves training neural networks to fit the spacetime around black holes and then employing these trained models to generate the path of light rays affected by gravitational lensing. This enables efficient and scalable simulations of black holes with optically thin accretion disks, significantly decreasing the time required for rendering compared to traditional methods. We validate our approach through extensive rendering of multiple black hole systems with superposed Kerr metric, demonstrating its capability to produce accurate visualizations with significantly $15\times$ reduced computational time. Our findings suggest that neural networks offer a promising alternative for rendering complex astrophysical phenomena, potentially paving a new path to astronomical visualization.

artificial intelligence, black hole, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2507.15775

Country: Asia > China (0.46)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback